GRAPHIC--Guidelines for Reviewing Algorithmic Practices in Human-centred Design and Interaction for Creativity

Martins, Joana Rovira, Martins, Pedro, Boavida, Ana

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) has been increasingly applied to creative domains, leading to the development of systems that collaborate with humans in design processes. In Graphic Design, integrating computational systems into co-creative workflows presents specific challenges, as it requires balancing scientific rigour with the subjective and visual nature of design practice. Following the PRISMA methodology, we identified 872 articles, resulting in a final corpus of 71 publications describing 68 unique systems. Based on this review, we introduce GRAPHIC (Guidelines for Reviewing Algorithmic Practices in Human-centred Design and Interaction for Creativity), a framework for analysing computational systems applied to Graphic Design. Its goal is to understand how current systems support human-AI collaboration in the Graphic Design discipline. The framework comprises three main dimensions, which our analysis revealed to be essential across diverse system types: (1) Collaborative Panorama, (2) Processes and Modalities, and (3) Graphic Design Principles. Its application revealed research gaps, including the need to balance initiative and control between agents, improve communication through explainable interaction models, and promote systems that support transformational creativity grounded in core design principles.


AnimAgents: Coordinating Multi-Stage Animation Pre-Production with Human-Multi-Agent Collaboration

Wang, Wen-Fan, Lu, Chien-Ting, Ng, Jin Ping, Chiu, Yi-Ting, Lee, Ting-Ying, Wang, Miaosen, Chen, Bing-Yu, Chen, Xiang 'Anthony'

arXiv.org Artificial Intelligence

Animation pre-production lays the foundation of an animated film by transforming initial concepts into a coherent blueprint across interdependent stages such as ideation, scripting, design, and storyboarding. While generative AI tools are increasingly adopted in this process, they remain isolated, requiring creators to juggle multiple systems without integrated workflow support. Our formative study with 12 professional creative directors and independent animators revealed key challenges in their current practice: creators must manually coordinate fragmented outputs, manage large volumes of information, and struggle to maintain continuity and creative control between stages. Based on these insights, we present AnimAgents, a human-multi-agent collaborative system that coordinates complex, multi-stage workflows through a core agent and specialized agents, supported by dedicated boards for the four major stages of pre-production. AnimAgents enables stage-aware orchestration, stage-specific output management, and element-level refinement, providing an end-to-end workflow tailored to professional practice. In a within-subjects summative study with 16 professional creators, AnimAgents significantly outperformed a strong single-agent baseline equipped with advanced parallel image generation in coordination, consistency, information management, and overall satisfaction (p < .01). A field deployment with 4 creators further demonstrated AnimAgents' effectiveness in real-world projects.
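The core-agent/specialist-agent coordination the abstract describes can be sketched as a minimal pipeline. This is our own illustration, not the AnimAgents implementation: the stage names follow the abstract, but the `CoreAgent` class, the placeholder `specialist` agents, and the board structure are hypothetical.

```python
# Hedged sketch of a core-agent/specialist-agent pattern like the one the
# AnimAgents abstract describes. Stage names come from the abstract; the
# dispatch logic and all identifiers are our own illustration.

STAGES = ["ideation", "scripting", "design", "storyboarding"]

def specialist(stage):
    # Stand-in for a stage-specific generative agent.
    def run(brief):
        return f"[{stage}] output for: {brief}"
    return run

class CoreAgent:
    """Coordinates stage-aware handoffs and keeps per-stage boards."""

    def __init__(self):
        self.agents = {s: specialist(s) for s in STAGES}
        self.boards = {s: [] for s in STAGES}  # stage-specific outputs

    def run_pipeline(self, brief):
        context = brief
        for stage in STAGES:
            result = self.agents[stage](context)
            self.boards[stage].append(result)   # board for later refinement
            context = result  # continuity: next stage builds on the last
        return self.boards

boards = CoreAgent().run_pipeline("a short film about a lighthouse keeper")
print(boards["storyboarding"][0])
```

The point of the sketch is the coordination shape: a single orchestrator threads context between stages and retains each stage's output on its own board, which is what lets a user refine one stage without losing the others.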


Mutual Wanting in Human--AI Interaction: Empirical Evidence from Large-Scale Analysis of GPT Model Transitions

Shang, HaoYang, Liu, Xuan

arXiv.org Artificial Intelligence

The rapid evolution of large language models (LLMs) creates complex bidirectional expectations between users and AI systems that are poorly understood. We introduce the concept of "mutual wanting" to analyze these expectations during major model transitions. Through analysis of user comments from major AI forums and controlled experiments across multiple OpenAI models, we provide the first large-scale empirical validation of bidirectional desire dynamics in human-AI interaction. Our findings reveal that nearly half of users employ anthropomorphic language, trust significantly exceeds betrayal language, and users cluster into distinct "mutual wanting" types. We identify measurable expectation violation patterns and quantify the expectation-reality gap following major model releases. Using advanced NLP techniques including dual-algorithm topic modeling and multi-dimensional feature extraction, we develop the Mutual Wanting Alignment Framework (M-WAF) with practical applications for proactive user experience management and AI system design. These findings establish mutual wanting as a measurable phenomenon with clear implications for building more trustworthy and relationally-aware AI systems.


Alignment Debt: The Hidden Work of Making AI Usable

Oyemike, Cumi, Akpan, Elizabeth, Hervé-Berdys, Pierre

arXiv.org Artificial Intelligence

Frontier LLMs are optimised around high-resource assumptions about language, knowledge, devices, and connectivity. Whilst widely accessible, they often misfit conditions in the Global South. As a result, users must often perform additional work to make these systems usable. We term this alignment debt: the user-side burden that arises when AI systems fail to align with cultural, linguistic, infrastructural, or epistemic contexts. We develop and validate a four-part taxonomy of alignment debt through a survey of 411 AI users in Kenya and Nigeria. Among respondents measurable on this taxonomy (n = 385), prevalence is: Cultural and Linguistic (51.9%), Infrastructural (43.1%), Epistemic (33.8%), and Interaction (14.0%). Country comparisons show a divergence in Infrastructural and Interaction debt, challenging one-size-fits-Africa assumptions. Alignment debt is associated with compensatory labour, but responses vary by debt type: users facing Epistemic challenges verify outputs at significantly higher rates (91.5% vs. 80.8%; p = 0.037), and verification intensity correlates with cumulative debt burden (Spearman's rho = 0.147, p = 0.004). In contrast, Infrastructural and Interaction debts show weak or null associations with verification, indicating that some forms of misalignment cannot be resolved through verification alone. These findings show that fairness must be judged not only by model metrics but also by the burden imposed on users at the margins, compelling context-aware safeguards that alleviate alignment debt in Global South settings. The alignment debt framework provides an empirically grounded way to measure user burden, informing both design practice and emerging African AI governance efforts.


Preview, Accept or Discard? A Predictive Low-Motion Interaction Paradigm

Berengueres, Jose

arXiv.org Artificial Intelligence

Repetitive strain injury (RSI) affects roughly one in five computer users and remains largely unresolved despite decades of ergonomic mouse redesign. All such devices share a fundamental limitation: they still require fine-motor motion to operate. This work investigates whether predictive, AI-assisted input can reduce that motion by replacing physical pointing with ranked on-screen suggestions. To preserve user agency, we introduce Preview Accept Discard (PAD), a zero-click interaction paradigm that lets users preview predicted GUI targets, cycle through a small set of ranked alternatives, and accept or discard them via key-release timing. We evaluate PAD in two settings: a browser-based email client and an ISO 9241-9 keyboard-prediction task under varying top-3 accuracies. Across both studies, PAD substantially reduces hand motion relative to trackpad use; task times remain comparable to the trackpad only when prediction accuracies approach those of the best spell-checkers.
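The preview/cycle/accept-or-discard loop can be captured as a small state machine. This is a hedged sketch of the paradigm as the abstract describes it, not the paper's implementation: the `PadSession` class, the dwell threshold, and the sample targets are all our own assumptions.

```python
# Minimal sketch of the PAD (Preview Accept Discard) interaction loop as
# described in the abstract. Class name, threshold value, and targets are
# hypothetical; only the interaction shape follows the paper.

class PadSession:
    HOLD_ACCEPT_S = 0.35  # hypothetical dwell threshold for accepting

    def __init__(self, ranked_targets):
        self.targets = list(ranked_targets)  # top-k predicted GUI targets
        self.index = 0                       # currently previewed target

    def preview(self):
        return self.targets[self.index]

    def cycle(self):
        # Step to the next ranked alternative (wraps around).
        self.index = (self.index + 1) % len(self.targets)
        return self.preview()

    def release(self, press_t, release_t):
        # Zero-click decision via key-release timing: a long dwell accepts
        # the previewed target, a quick tap discards it.
        if release_t - press_t >= self.HOLD_ACCEPT_S:
            return ("accept", self.preview())
        return ("discard", None)

session = PadSession(["Reply", "Archive", "Delete"])  # top-3 suggestions
session.cycle()                  # move the preview to "Archive"
print(session.release(0.0, 0.5)) # prints ('accept', 'Archive')
```

The design point is that no pointing motion occurs anywhere in the loop: selection among ranked targets and the accept/discard decision are both carried by a single key's hold duration.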


Lived Experience in Dialogue: Co-designing Personalization in Large Language Models to Support Youth Mental Well-being

Guan, Kathleen W., Giri, Sarthak, Amara, Mohammed, Jansen, Bernard J., Liscio, Enrico, Esherick, Milena, Owayyed, Mohammed Al, Ratkute, Ausrine, Sedrakyan, Gayane, de Reuver, Mark, Goncalves, Joao Fernando Ferreira, Figueroa, Caroline A.

arXiv.org Artificial Intelligence

We conducted three 90-minute workshops at Talenthub Op Zuid, each with a different group of participants (total N=24, M_age=17.6, SD=1.2; see Supplement for additional details). In the first workshop, participants reviewed the prior 13 personas from Stage 1 and critiqued them for gaps in relevance. The scoping personas generated from survey and forum data gave youth stakeholders a concrete starting point for consulting as experts by experience in initial co-design activities. They challenged the realism of the scoping personas. Using fill-in-the-blank templates to guide but not restrict their persona creation (created by a youth member of the research team with design training, see Supplement), youth added contextual details to the project personas, such as daily routines, stressors, and digital habits, and brainstormed plausible backstories involving bullying, school difficulties, or parental conflict. The second workshop engaged a new participant group who expanded on previous outputs and addressed additional questions on living environment and emotional support needs, as this was suggested as relevant by youth from the prior workshop. Participants revised or created new personas based on their own or peers' experiences. In the third workshop, a new group of participants again reviewed prior co-creation outputs and further refined the personas.


Race and Gender in LLM-Generated Personas: A Large-Scale Audit of 41 Occupations

van der Linden, Ilona, Kumar, Sahana, Dixit, Arnav, Sudan, Aadi, Danda, Smruthi, Anastasiu, David C., Lukoff, Kai

arXiv.org Artificial Intelligence

Generative AI tools are increasingly used to create portrayals of people in occupations, raising concerns about how race and gender are represented. We conducted a large-scale audit of over 1.5 million occupational personas across 41 U.S. occupations, generated by four large language models with different AI safety commitments and countries of origin (U.S., China, France). Compared with Bureau of Labor Statistics data, we find two recurring patterns: systematic shifts, where some groups are consistently under- or overrepresented, and stereotype exaggeration, where existing demographic skews are amplified. On average, White (−31pp) and Black (−9pp) workers are underrepresented, while Hispanic (+17pp) and Asian (+12pp) workers are overrepresented. These distortions can be extreme: for example, across all four models, Housekeepers are portrayed as nearly 100% Hispanic, while Black workers are erased from many occupations. For HCI, these findings show provider choice materially changes who is visible, motivating model-specific audits and accountable design practices.
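The percentage-point (pp) shifts the abstract reports come from comparing the demographic share of generated personas against a labour-force baseline. The sketch below shows that comparison step; it is not the authors' pipeline, and every number in it is hypothetical, not BLS or study data.

```python
# Illustrative audit step (not the authors' pipeline): compare the share
# of generated personas per demographic group against a baseline share,
# reporting percentage-point (pp) shifts. All numbers are hypothetical.

from collections import Counter

def pp_shifts(personas, baseline_shares):
    # personas: list of group labels for one occupation.
    # baseline_shares: group -> labour-force share (0.0-1.0).
    counts = Counter(personas)
    n = len(personas)
    shifts = {}
    for group, base in baseline_shares.items():
        generated = counts.get(group, 0) / n
        shifts[group] = round((generated - base) * 100, 1)  # in pp
    return shifts

# Hypothetical generated personas vs. a hypothetical BLS-style baseline.
generated = ["Hispanic"] * 60 + ["White"] * 20 + ["Asian"] * 15 + ["Black"] * 5
baseline = {"White": 0.45, "Black": 0.15, "Hispanic": 0.30, "Asian": 0.10}
print(pp_shifts(generated, baseline))
```

A positive value means the group is overrepresented relative to the baseline and a negative value means it is underrepresented; averaging such per-occupation shifts across occupations yields summary figures of the kind the abstract cites.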


Towards Better Health Conversations: The Benefits of Context-seeking

Sayres, Rory, Hao, Yuexing, Ward, Abbi, Wang, Amy, Freeman, Beverly, Zhan, Serena, Ardila, Diego, Li, Jimmy, Lee, I-Ching, Iurchenko, Anna, Kou, Siyi, Badola, Kartikeya, Hu, Jimmy, Kumar, Bhawesh, Johnson, Keith, Vijay, Supriya, Krogue, Justin, Hassidim, Avinatan, Matias, Yossi, Webster, Dale R., Virmani, Sunny, Liu, Yun, Duong, Quang, Schaekermann, Mike

arXiv.org Artificial Intelligence

Navigating health questions can be daunting in the modern information landscape. Large language models (LLMs) may provide tailored, accessible information, but also risk being inaccurate, biased or misleading. We present insights from 4 mixed-methods studies (total N=163), examining how people interact with LLMs for their own health questions. Qualitative studies revealed the importance of context-seeking in conversational AIs to elicit specific details a person may not volunteer or know to share. Context-seeking by LLMs was valued by participants, even if it meant deferring an answer for several turns. Incorporating these insights, we developed a "Wayfinding AI" to proactively solicit context. In a randomized, blinded study, participants rated the Wayfinding AI as more helpful, relevant, and tailored to their concerns compared to a baseline AI. These results demonstrate the strong impact of proactive context-seeking on conversational dynamics, and suggest design patterns for conversational AI to help navigate health topics.


Plural Voices, Single Agent: Towards Inclusive AI in Multi-User Domestic Spaces

Chandra, Joydeep, Navneet, Satyam Kumar

arXiv.org Artificial Intelligence

Domestic AI agents face ethical, autonomy, and inclusion challenges, particularly for overlooked groups such as children, the elderly, and Neurodivergent users. We present the Plural Voices Model (PVM), a novel single-agent framework that dynamically negotiates multi-user needs through real-time value alignment, leveraging diverse public datasets on mental health, eldercare, education, and moral reasoning. Using human+synthetic curriculum design with fairness-aware scenarios and ethical enhancements, PVM identifies core values, conflicts, and accessibility requirements to inform inclusive principles. Our privacy-focused prototype features adaptive safety scaffolds, tailored interactions (e.g., step-by-step guidance for Neurodivergent users, simple wording for children), and equitable conflict resolution. In preliminary evaluations, PVM outperforms multi-agent baselines in compliance (76% vs. 70%), fairness (90% vs. 85%), safety-violation rate (0% vs. 7%), and latency. Design innovations, including video guidance, autonomy sliders, family hubs, and adaptive safety dashboards, demonstrate new directions for ethical and inclusive domestic AI and for building user-centered agentic systems in plural domestic contexts. Our code and model are open-sourced and available for reproduction: https://github.com/zade90/Agora


Attention to Non-Adopters

Zhou, Kaitlyn, Gligorić, Kristina, Cheng, Myra, Lam, Michelle S., Raman, Vyoma, Aminu, Boluwatife, Woo, Caeley, Brockman, Michael, Cha, Hannah, Jurafsky, Dan

arXiv.org Artificial Intelligence

Although language model-based chat systems are increasingly used in daily life, most Americans remain non-adopters of chat-based LLMs -- as of June 2025, 66% had never used ChatGPT. At the same time, LLM development and evaluation rely mainly on data from adopters (e.g., logs, preference data), focusing on the needs and tasks of a group of adopters that is demographically limited in terms of geographic location, education, and gender. In this position paper, we argue that incorporating non-adopter perspectives is essential for developing broadly useful and capable LLMs. We contend that relying on methods that focus primarily on adopters risks missing a range of tasks and needs prioritized by non-adopters, entrenching inequalities in who benefits from LLMs, and creating oversights in model development and evaluation. To illustrate this claim, we conduct case studies with non-adopters and show: how non-adopter needs diverge from those of current users, how non-adopter needs point us towards novel reasoning tasks, and how to systematically integrate non-adopter needs via human-centered methods.